Evaluation of a Cross-lingual Romanian-English Multi-document Summariser

Authors

  • Constantin Orasan
  • Oana Andreea Chiorean
Abstract

The rapid growth of the Internet means that more information is available than ever before. Multilingual multi-document summarisation offers a way to access this information even when it is not in a language spoken by the reader, by extracting the gist from related documents and translating it automatically. This paper presents an experiment in which Maximal Marginal Relevance (MMR), a well-known multi-document summarisation method, is used to produce summaries from Romanian news articles. A task-based evaluation performed on both the original summaries and their automatically translated versions reveals that they still contain a significant portion of the important information from the original texts. However, direct evaluation of the automatically translated summaries shows that they are not very legible, which can put off some readers who want to find out more about a topic.

Similar resources

Using a Keyness Metric for Single and Multi Document Summarisation

In this paper we show the results of our participation in the MultiLing 2013 summarisation tasks. We participated with single-document and multi-document corpus-based summarisers for both Arabic and English languages. The summarisers used word frequency lists and log likelihood calculations to generate single and multi document summaries. The single and multi summaries generated by our systems ...

Finding the Best Approach for Multi-lingual Text Summarisation: A Comparative Analysis

This paper addresses the problem of multilingual text summarisation. The goal is to analyse three approaches for generating summaries in four languages (English, Spanish, German and French), in order to determine the best one to adopt when tackling this issue. The proposed approaches rely on: i) language-independent techniques; ii) language-specific resources; and iii) machine translation resou...

Multi-document multilingual summarization and evaluation tracks in ACL 2013 MultiLing Workshop

The MultiLing 2013 Workshop of ACL 2013 posed a multi-lingual, multidocument summarization task to the summarization community, aiming to quantify and measure the performance of multi-lingual, multi-document summarization systems across languages. The task was to create a 240–250 word summary from 10 news articles, describing a given topic. The texts of each topic were provided in 10 languages ...

Language engineering for syntactic knowledge transfer

In this paper we present a method for an English-Romanian treebank construction, together with the obtained evaluation results. The treebank is built upon a parallel English-Romanian corpus word-aligned and annotated at the morphological and syntactic level. The syntactic trees of the Romanian texts are generated by considering the syntactic phrases of the English parallel texts automatically r...

Cross-Lingual Romanian to English Question Answering at CLEF 2006

This paper describes the development of a Question Answering (QA) system and its evaluation results in the Romanian-English cross-lingual track organized as part of the CLEF 2006 campaign. The development stages of the cross-lingual Question Answering system are described incrementally throughout the paper, at the same time pinpointing the problems that occurred and the way they were addressed....

Publication date: 2008